Learning Rational Stochastic Tree Languages
Abstract
We consider the problem of learning stochastic tree languages, i.e. probability distributions over a set of trees T(F), from a sample of trees independently drawn according to an unknown target P. We assume that the target is a rational stochastic tree language, i.e. that it can be computed by a rational tree series or, equivalently, by a multiplicity tree automaton. We provide two contributions. First, we show that rational tree series admit a canonical representation whose parameters can be efficiently estimated from samples. Second, we give an inference algorithm that identifies the class of rational stochastic tree languages in the limit with probability one.
Similar resources
Relevant Representations for the Inference of Rational Stochastic Tree Languages
Recently, an algorithm DEES was proposed for learning rational stochastic tree languages. Given a sample of trees independently and identically drawn according to a distribution defined by a rational stochastic language, DEES outputs a linear representation of a rational series which converges to the target. DEES can then be used to identify in the limit with probability one rational stochastic t...
A probabilistic extension of locally testable tree languages
Probabilistic k-testable models (usually known as k-gram models in the case of strings) can be easily identified from samples and allow for smoothing techniques to deal with unseen events. In this paper we introduce the family of stochastic k-testable tree languages and describe how these models can approximate any stochastic rational tree language. This is applied, as a particular case, to the...
Learning Rational Stochastic Languages
Given a finite set of words w1, ..., wn independently drawn according to a fixed unknown distribution law P called a stochastic language, a usual goal in Grammatical Inference is to infer an estimate of P in some class of probabilistic models, such as Probabilistic Automata (PA). Here, we study the class S R (Σ) of rational stochastic languages, which consists in stochastic languages that c...
Rational stochastic languages
The goal of the present paper is to provide a systematic and comprehensive study of rational stochastic languages over a semiring K ∈ {Q, Q+, R, R+}. A rational stochastic language is a probability distribution over a free monoid Σ* which is rational over K, that is, which can be generated by a multiplicity automaton with parameters in K. We study the relations between the classes of rational stochasti...
Learning context-free grammars from stochastic structural information
We consider the problem of learning context-free grammars from stochastic structural data. For this purpose, we have developed an algorithm (tlips) which identifies any rational tree set from stochastic samples and approximates the probability distribution of the trees in the language. The procedure identifies equivalent subtrees in the sample and outputs the hypothesis in linear time with the nu...